System and method for testing for memory errors in a computer system

ABSTRACT

The specification may disclose a computer system that may operate a portion of available memory as backup to a primary memory, and the computer system may be adapted to test the backup memory for memory errors at times other than execution of power-on self-test procedures.

BACKGROUND

[0001] Computer systems may couple to the Internet, wide area networks (WANs) and/or local area networks (LANs). When coupled to networks in this fashion, a computer system may provide a variety of functions such as, without limitation, data storage, information handling, message routing, and on-line business transactions. Where a particular computer system may be providing a mission-critical service, such as order processing for an on-line retailer, the computer system may implement redundancy of its hardware in order to sustain operation in spite of hardware failures.

[0002] One aspect of a computer system that may utilize duplicate or redundant hardware is main memory. Main memory may provide storage of data and instructions used by a processor. Main memory, which may comprise semiconductor devices, may suffer from correctable errors (i.e., errors that can be detected and corrected). Correctable errors may comprise changes in memory values caused by cosmic rays, transient hardware events, and the like. Correction of memory errors may be accomplished by use of error correction code (ECC) or other technology. However, main memory may also experience uncorrectable errors, which may not be correctable using ECC technology.

[0003] The occurrence of errors in main memory may increase with memory capacity. Thus, as memory capacity increases, the number of correctable and uncorrectable errors may likewise increase, leading to an increased likelihood that the computer system may fail in some respect (e.g., complete system crash, undesirable system behavior). Computer systems may check main memory for errors during the power-on self-test (POST) procedures. During POST procedures, the computer system may have yet to load an operating system, and thus may have the ability to thoroughly check each memory location within the main memory. Once an operating system and end-user programs are loaded and being executed, it may not be possible to check each main memory location for errors without unduly affecting operation of the computer system. The inability to check for memory errors in an operational computer system may lead to operational failures in unexpected situations.

[0004] Duplicate or redundant main memory may be referred to as “on-line” or “hot-spare” memory. Hot-spare memory thus may refer to main memory within a computer system that may be utilized if a primary memory experiences an uncorrectable memory failure, or experiences a series of correctable memory failures which may signal an impending uncorrectable memory failure. In this circumstance, the computer system may make the hot-spare memory the primary memory, thus allowing the faulty memory to be repaired and/or replaced. Some computer systems may have the ability to accept additional memory while the computer system is operational (“hot-add” memory), and such memory may be used immediately, or may be used as hot-spare memory.

[0005] While hot-spare memory may have been tested during POST procedures of the computer system, the memory may not be used in actual operation for weeks, or even months, and may experience failures prior to use. In the event that the computer system experiences a failure in a memory designated as “primary,” a swap to the hot-spare memory with undetected failures may result in a computer system crash. Hot-add memory may not have been present in the computer system during POST procedures, and thus may not have been tested prior to becoming the primary memory in the system. Again, this may lead to the possibility that a swap to the spare memory may cause a computer system crash.

BRIEF DESCRIPTION OF THE DRAWINGS

[0006] For a detailed description of the embodiments of the invention, reference will now be made to the accompanying drawings in which:

[0007] FIG. 1 illustrates a computer system constructed in accordance with embodiments of the invention;

[0008] FIG. 2 illustrates a memory diagram that exemplifies memory mapping in accordance with embodiments of the invention; and

[0009] FIG. 3 illustrates a flow diagram for memory checking in accordance with embodiments of the invention.

NOTATION AND NOMENCLATURE

[0010] Certain terms are used throughout the following description and claims to refer to particular system components. As one skilled in the art will appreciate, computer companies may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function.

[0011] In the following discussion and in the claims, the terms “including” and “comprising” are used in an open-ended fashion, and thus should be interpreted to mean “including, but not limited to.” Also, the term “couple” or “couples” is intended to mean either an indirect or direct electrical connection. Thus, if a first device couples to a second device, that connection may be through a direct electrical connection, or through an indirect electrical connection via other devices and connections.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

[0012] The following discussion is directed to various embodiments of the invention. The embodiments disclosed should not be interpreted or otherwise used as limiting the scope of the disclosure, including the claims, unless otherwise specified. In addition, one skilled in the art will understand that the following description has broad application. The discussion of the embodiments is meant only to be exemplary, and not intended to intimate that the scope of the disclosure, including the claims, is limited to these embodiments.

[0013] Referring initially to FIG. 1, computer system 100 may comprise a plurality of central processing units (CPUs or processors) 2A-D coupled to a main memory, which may be implemented using memory arrays on one or more memory boards 4, 6. The processors 2 may couple to the memory array on each memory board 4, 6 by way of a bridge logic unit 8. In the exemplary computer system 100, the bridge logic unit 8 may be termed a “North Bridge” based on its location in computer system drawings. The processors may be of any suitable type, such as processors available from Intel Corporation (e.g., Intel® Pentium® 4 XEON processors, or Itanium™ processors), AMD, or Motorola. The processors 2 may couple to the North Bridge 8 by way of a suitable bus 10, which may be a single bus (as illustrated), a split bus, or individual buses. The computer system 100 may implement only a single processor, if desired.

[0014] The memory arrays on memory boards 4, 6 may couple to the North Bridge 8 through a memory bus 12. North Bridge 8 may comprise a memory control unit (not specifically shown) that controls transactions to the main memory by sending control signals during memory accesses. The main memory may function as the working memory for the processors 2 and may comprise, on each memory board, a conventional memory device or array of memory devices in which programs, instructions and data may be stored. Thus, memory boards 4, 6 may each contain a suitable type of memory, such as dynamic random access memory (DRAM) or any of the various types of DRAM devices such as synchronous DRAM (SDRAM), extended data output DRAM (EDO-DRAM), double-data-rate SDRAM (DDR SDRAM), and the like. In at least some of the embodiments of the invention, however, the main memory is implemented using DDR SDRAM packaged in dual in-line memory modules (DIMMs). The memory on each DIMM may be organized into a plurality of memory banks, with memory locations within each memory bank accessed on a row and column basis.

[0015] The memory boards 4, 6 may be selectively removed and added to the computer system without requiring the computer system to shut down. Quick switch device 14 may selectively couple and decouple an address bus, data bus, control signals, and power signals to each of the memory boards 4, 6. In particular, prior to removal of one of the memory boards, the quick switch device 14, possibly in combination with other devices, may electrically decouple address, data, control and power signals from the memory board to be removed. Likewise, after insertion of a memory board into the computer system 100, the quick switch device 14, possibly in combination with other devices, may provide power to the newly inserted memory board, and at appropriate times thereafter, couple the address bus, data bus, and control signals to the newly inserted memory board. Co-pending patent application Ser. No. 10/179,001 filed Jun. 25, 2002 titled “Computer System Architecture with Hot Pluggable Main Memory Boards,” incorporated by reference herein as if reproduced in full below, may describe computer systems having the ability to accept hot-pluggable main memory boards.

[0016] The computer system 100 illustrated in FIG. 1 may also comprise a second bridge logic device 16 that bridges a primary expansion bus 18 to various secondary expansion buses, such as a low pin count (LPC) bus 20, a read only memory (ROM) bus 22, and a peripheral component interconnect (PCI) bus 24. Although the computer system 100 illustrated in FIG. 1 only shows three secondary expansion buses, a variety of suitable secondary expansion buses may be implemented, such as PCI-X, EISA, AGP and the like. In a fashion similar to the naming of North Bridge 8, bridge device 16 may be referred to as a “South Bridge” based on its location in computer system drawings. In at least some embodiments of the invention, North Bridge 8 and South Bridge 16 may be part of a chipset produced by Server Works, Inc., such as the Grand Champion™ HE Chipset. In embodiments utilizing the Grand Champion HE Chipset, the primary expansion bus 18 may comprise a Thin Intermodule Bus (TIB) (which is a proprietary bus of Server Works, Inc.); however, the computer system 100 illustrated in FIG. 1 is not limited to any particular type of chipset, and thus the primary expansion bus 18 may comprise other suitable buses, such as a PCI bus, or a Hublink™ bus (which is a proprietary bus of the Intel Corporation).

[0017] Still referring to FIG. 1, a ROM device 26 may couple, by way of the ROM bus 22, to the South Bridge 16. The ROM 26 may store software programs executable by the processors 2. The programs on the ROM 26 may be basic input/output system (BIOS) commands, stackless code executed during power-on self-test (POST) procedures, as well as dedicated programs that are executed based on the issuance of system management interrupts (SMIs) by various computer system devices, discussed more fully below.

[0018] Still referring to FIG. 1, the computer system 100 may also comprise an input/output (I/O) controller 28 that may couple to the South Bridge 16 by way of the LPC bus 20. The I/O controller 28 may couple a keyboard 30, a mouse 32, and various other I/O devices (not specifically shown) to the computer system 100.

[0019] The computer system 100 may further comprise a hard disk or hard drive (HD) 34 that may couple to the South Bridge 16 by way of the PCI bus 24. While only a single hard disk 34 is illustrated in FIG. 1, there may be multiple hard disks operated as individual storage devices, or possibly in a redundant array of independent disks (RAID) configuration.

[0020] The computer system 100 may also comprise an advanced server management (ASM) device 36. The ASM device 36 may be a microcontroller programmed to perform a variety of server management functions. Alternatively, the ASM device 36 may be an application-specific integrated circuit (ASIC), which again may be designed and constructed to perform server management functions. The ASM 36 may couple to the South Bridge 16, and remaining computer system components, by way of the PCI bus 24. In at least some of the embodiments of the invention, the ASM 36 may be responsible for issuing system management interrupts to the processors 2, possibly by asserting the SMI# signal line 38 coupled between the ASM 36 and the processors 2. Upon receiving a SMI, the processors 2 may temporarily suspend execution of the operating system and end-user programs, and execute system management programs (SMI routines). The system management programs may initially be stored on the ROM 26, but may be accessible to the processors 2 in a shadowed ROM area of the main memory (not specifically shown).

[0021] In computer systems implementing redundant main memory systems, one portion of the available memory, for example all the memory on a single memory board, may be designated as a “primary” memory for the computer system, and another portion, for example the memory on a second memory board, may be held in reserve as “hot-spare” memory. During the error-free operation of the primary memory, the hot-spare memory may experience a memory failure, which was not present or was not detected during the POST procedures. If the computer system 100 changes the “primary” designation to the hot-spare memory containing previously undetected memory errors, the change-over may crash the computer system. Embodiments of this invention thus may periodically test the hot-spare memory, for example the memory on a hot-spare memory board. Likewise, embodiments of the invention may test the hot-spare memory just prior to a change of the “primary” designation to the hot-spare. In this way, if the hot-spare memory has experienced a memory failure, the computer system 100 may determine the failure prior to swapping from the “primary” memory, which may only be experiencing correctable errors, and thus could continue to be used.

[0022] Embodiments of the invention may also have the ability to receive hot-add memory. That is, it may be possible to insert additional memory, for example in the form of an additional memory board, into the computer system while the computer system is operational. In the hot-add memory case, the newly inserted memory may not have been present during POST procedures, and thus may not have been tested for memory errors. Much like the hot-spare case discussed above, after insertion of the hot-add memory, the computer system 100 may test the memory for errors. Once the hot-add memory board has been inserted, it may be used as a hot-spare memory, and thus embodiments of the invention may also continue to periodically check the viability of the memory on the newly inserted board, and also may test just prior to a swap of the primary designation.

[0023] Embodiments of the invention may implement a thirty-two bit address bus. This may mean that a computer system 100 such as that illustrated in FIG. 1 may only have approximately four gigabytes of virtual address space. The virtual address space may be alternatively referred to as the directly addressable memory space or directly addressable memory area, and may represent the number of unique combinations of the states of the thirty-two bit bus, each combination addressing one location. At least some of the physical devices within the computer system 100 may utilize portions of the virtual address space, and as such, the entire virtual address space may not be dedicated to addressing main memory. For example, PCI devices such as hard disk 34 may have a series of registers or buffers that facilitate transfer of data, and these registers or buffers may be assigned addresses in the virtual address space. Likewise, the BIOS routines stored on the ROM 26 may be copied to a shadowed ROM area within the main memory, and these programs too may be addressed using addresses in the virtual address space. Remaining addresses in the virtual address space may be assigned to main memory; however, embodiments of the invention may utilize more main memory than may be directly addressed in the virtual address space. Embodiments of the invention may implement 16 gigabytes, 32 gigabytes, 64 gigabytes, and more, and in these situations the entire main memory may not be directly addressed. Using physical address extension (PAE) mode techniques, which may selectively map portions (known as pages) of the main memory to the virtual address space, it may be possible for the operating system and end-user programs to access main memory beyond that which may be directly addressable.
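
The arithmetic in the preceding paragraph may be illustrated with a short sketch. It assumes byte-addressed memory, so that a thirty-two bit bus reaches 2^32 locations; the function names are hypothetical and do not appear in the embodiments. Because devices and the shadowed ROM area may also consume addresses, the figures printed are only a lower bound on the memory that must be reached by mapping.

```python
# Illustrative arithmetic only. Assumption: one byte per address.
GIB = 1 << 30
ADDRESS_BITS = 32  # width of the address bus described above


def directly_addressable_bytes(address_bits: int) -> int:
    """Number of unique addresses a bus of the given width can express."""
    return 1 << address_bits


def bytes_needing_mapping(installed_bytes: int,
                          address_bits: int = ADDRESS_BITS) -> int:
    """Lower bound on installed memory that lies outside the direct space."""
    return max(0, installed_bytes - directly_addressable_bytes(address_bits))


for installed_gib in (16, 32, 64):
    outside = bytes_needing_mapping(installed_gib * GIB)
    print(f"{installed_gib} GiB installed: at least {outside // GIB} GiB "
          f"reachable only through mapping")
```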

[0024] In the various embodiments of the invention, the North Bridge 8 may be responsible for maintaining memory mapping tables, and the like, to implement the PAE mode used by the operating system and end-user programs. Memory testing performed in accordance with embodiments of the invention, however, may be performed in a system management mode (SMI mode), as may be triggered by ASM 36 asserting signal line 38. Using SMI mode allows testing of the memory without the help, or intervention, of the operating system, and thus the testing may be transparent to the operating system. While it may be possible to utilize PAE mode addressing techniques in the system management mode, SMI routines in accordance with embodiments of the invention may only address the directly addressable memory area, possibly to keep the SMI routines short and to reduce execution time.

[0025] Referring now to FIG. 2, embodiments of the invention may test for memory errors in the memory area above four gigabytes by mapping portions, for example portions 52, 54, 56 and 58, into a test region 60 in the directly addressable memory 62 (the mapping illustrated by arrow 57 with respect to portion 56). Each portion 52, 54, 56 and/or 58 may be mapped, possibly one portion at a time, to the region 60, and that portion of the memory may be tested for memory errors. The extent to which the memory is tested for errors may depend on the amount of processor time that may be dedicated to memory testing. In particular, if the computer system is relatively lightly loaded, it may be possible to perform detailed memory error testing procedures, such as walking zeros and/or walking ones in each memory location. If, however, the computer system is heavily loaded, and/or the amount of memory above the directly addressable memory space is large, the amount of time required to perform detailed memory tests by the processor in the SMI mode may adversely affect system performance. In such circumstances, embodiments of the invention may perform memory tests by less detailed means, for example by writing patterns to the memory, then reading the memory and comparing the pattern read to the pattern written. If the pattern read is the same as the pattern written, it may be assumed that the memory is still functioning correctly. These memory tests are merely exemplary of possible memory tests that may be performed, and may be used together or separately, possibly with other memory tests. Before proceeding, it should be understood that the memory testing procedures described may be performed after POST procedures have completed, and during run-time of the computer system. Thus, the memory testing described is in addition to any memory testing that may take place during POST procedures.
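
The two levels of testing described above may be sketched as follows. This is a minimal simulation under stated assumptions, not firmware: the mapped test region is modeled as a Python byte buffer, `FaultyMemory` is a hypothetical fixture that forces one bit of one byte to zero, and the 0xA5 pattern is an arbitrary choice. The sketch suggests why the quicker pattern comparison may miss a fault that a walking-ones pass may catch.

```python
class FaultyMemory:
    """Hypothetical fixture: memory in which one bit of one byte is stuck at 0."""

    def __init__(self, size, stuck_offset, stuck_bit):
        self._cells = bytearray(size)
        self._stuck_offset = stuck_offset
        self._stuck_mask = 1 << stuck_bit

    def __len__(self):
        return len(self._cells)

    def __getitem__(self, index):
        return self._cells[index]

    def __setitem__(self, index, value):
        if index == self._stuck_offset:
            value &= ~self._stuck_mask & 0xFF  # the stuck bit never stores a 1
        self._cells[index] = value


def pattern_test(mem, pattern=0xA5):
    """Less detailed test: write one pattern everywhere, then read and compare."""
    for i in range(len(mem)):
        mem[i] = pattern
    return [i for i in range(len(mem)) if mem[i] != pattern]


def walking_ones_test(mem):
    """Detailed test: walk a single 1 bit through every position of every byte."""
    bad = []
    for i in range(len(mem)):
        for bit in range(8):
            value = 1 << bit
            mem[i] = value
            if mem[i] != value:
                bad.append(i)
                break
    return bad


# Bit 1 is clear in 0xA5, so the quick pattern never exercises the stuck bit.
print(pattern_test(FaultyMemory(16, stuck_offset=5, stuck_bit=1)))
print(walking_ones_test(FaultyMemory(16, stuck_offset=5, stuck_bit=1)))
```

A walking-zeros pass would follow the same structure with the inverted values (`0xFF ^ (1 << bit)`), and would be needed to expose bits stuck at one.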

[0026] Embodiments of the invention may implement the mapping and memory testing technique using a combination of programs executed during system management mode, as well as a plurality of registers in the North Bridge 8. Referring again to FIG. 1, North Bridge 8 may comprise two registers 64 and 66. Register 64 may contain an address of the starting location of the memory region 60, as illustrated by line 68 of FIG. 2. In at least some of the embodiments of the invention, the memory region 60 may be of known or predefined size, for example 32 kilobytes, and thus by reading the base address of the memory region, the addresses of the entire region may be calculated. Alternatively, the base address register 64 may identify an ending address of the region 60, as illustrated by line 69 in FIG. 2. In alternative embodiments, the North Bridge 8 may comprise an additional register (not shown) which contains an ending address of the region 60. In other embodiments of the invention, the starting address, ending address, or both, may be hard-coded in a memory testing program, thus negating the need for a register pointing to region 60.

[0027] North Bridge 8 may also comprise a register 66, which may contain a starting address of one of the memory portions in the memory area above four gigabytes. In particular, the register 66 may contain the starting (or alternatively ending) address of one of the memory portions 52, 54, 56 or 58. In embodiments where the size of the memory region 60 may be predefined, the size of the memory portion to be tested may likewise be the same, and thus by reading the starting (or alternatively ending) address, the addresses of the entire portion may be calculated. In alternative embodiments, the North Bridge 8 may contain an additional register (not shown) that, used in combination with register 66, may define the beginning and ending address of the memory portion.
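
The address calculation implied by registers 64 and 66 may be sketched as below. The 32 kilobyte size is the example given above; the register contents used in the demonstration are invented values, and the helper names are hypothetical. Ranges are returned as inclusive (first, last) address pairs.

```python
REGION_SIZE = 32 * 1024  # example predefined size from the description


def range_from_start(start_address, size=REGION_SIZE):
    """Register holds the first address; derive the last address."""
    return (start_address, start_address + size - 1)


def range_from_end(end_address, size=REGION_SIZE):
    """Register holds the last address; derive the first address."""
    return (end_address - size + 1, end_address)


# Invented register contents, for illustration only.
register_64 = 0x0010_0000       # base of the test region in directly addressable space
register_66 = 0x1_0000_0000     # start of a portion just above four gigabytes

region = range_from_start(register_64)
portion = range_from_start(register_66)
print(f"test region:  {region[0]:#x} .. {region[1]:#x}")
print(f"test portion: {portion[0]:#x} .. {portion[1]:#x}")
```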

[0028] FIG. 3 illustrates an exemplary flow diagram for checking for memory errors as may be utilized in accordance with embodiments of the invention. In particular, the process may start (block 70) and the ASM 36 may assert the SMI# signal line 38 (block 72). The ASM 36 may assert the signal line 38 periodically, for example, based on a pre-set or programmed timer, or may assert the signal line 38 responsive to an event within the computer system, such as insertion of a memory board or failure of a “primary” memory board. Upon receiving the system management interrupt, the processors 2 may temporarily suspend execution of the operating system and end-user programs, enter a system management mode, and execute a memory testing program (SMI routine) (block 74). The memory testing program, which in some embodiments may be stored on ROM 26 and executed from a ROM shadow region in the main memory (not specifically shown), may read an address of the memory region 60 from the register 64, and may likewise read an address for the memory portion to be checked from the register 66 (block 76). Utilizing those two pieces of information, the program may then request the North Bridge 8 to map the memory portion (block 77), for example memory portion 52, 54, 56 or 58, to the memory region 60, and then the memory testing program may perform a memory test (block 78).

[0029] As was discussed above, the memory testing complexity may range from simple to complex, depending on the impact to computer system performance that may be tolerated. If the memory test reveals no memory errors (block 80), the SMI program may update the memory register 66 with an address of the next memory portion to be tested (block 82) and the process may end (block 84). If, however, the program detects a memory error (block 80), a notification is made (block 86), and the process continues with updating of the register 66 (block 82). The type of notification may depend on the memory error encountered. If the SMI program identifies correctable errors, a computer system user or administrator may elect to leave the memory board in place, as the memory may still be operational and allow the computer system to function in the event that the primary memory experiences an uncorrectable failure. If the SMI program uncovers uncorrectable failures, the notification (block 86) may allow the system administrator or computer system user to immediately remove and replace the memory board with an operational memory board.
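
The flow of FIG. 3 may be modeled in software as sketched below. This is a simulation under stated assumptions rather than an SMI routine: `BridgeModel` is a hypothetical stand-in for the North Bridge 8, mapping is modeled as a writable view onto a byte buffer, the portion size is taken to be 32 kilobytes, and the wrap-around to the first portion after the last one is an assumption not stated in the description. Block numbers in the comments refer to FIG. 3.

```python
FOUR_GIB = 1 << 32
PORTION_SIZE = 32 * 1024  # assumed equal to the size of test region 60


class BridgeModel:
    """Hypothetical model of the bridge registers and hot-spare memory."""

    def __init__(self, spare_size, region_base):
        self.spare = bytearray(spare_size)  # hot-spare memory above four gigabytes
        self.reg64 = region_base            # base address of test region 60
        self.reg66 = FOUR_GIB               # address of the next portion to test
        self.mapping = None                 # (region base, portion address)

    def map_portion(self, region_base, portion_addr):
        """Model the mapping request; return a writable window on the portion."""
        self.mapping = (region_base, portion_addr)
        offset = portion_addr - FOUR_GIB
        return memoryview(self.spare)[offset:offset + PORTION_SIZE]


def quick_pattern_test(window, pattern=0x5A):
    """Write a pattern, read it back, and return the offsets that differ."""
    for i in range(len(window)):
        window[i] = pattern
    return [i for i in range(len(window)) if window[i] != pattern]


def smi_routine(bridge, memory_test, notify):
    region_base = bridge.reg64                               # block 76
    portion_addr = bridge.reg66                              # block 76
    window = bridge.map_portion(region_base, portion_addr)   # block 77
    bad_offsets = memory_test(window)                        # block 78
    if bad_offsets:                                          # block 80
        notify(portion_addr, bad_offsets)                    # block 86
    next_addr = portion_addr + PORTION_SIZE                  # block 82
    if next_addr >= FOUR_GIB + len(bridge.spare):
        next_addr = FOUR_GIB  # assumption: start over after the last portion
    bridge.reg66 = next_addr


bridge = BridgeModel(spare_size=4 * PORTION_SIZE, region_base=0x0010_0000)
reports = []
for _ in range(5):  # five simulated interrupts, one portion tested per interrupt
    smi_routine(bridge, quick_pattern_test,
                lambda addr, bad: reports.append((addr, bad)))
print(f"next portion: {bridge.reg66:#x}, reports: {reports}")
```

Testing one portion per interrupt keeps each pass short, which is consistent with the stated aim of limiting the time spent in system management mode.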

[0030] The above discussion is meant to be illustrative of the principles and various embodiments of the present invention. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. For example, the various embodiments of the invention discuss hot-spare memory implemented on a board-to-board basis; however, the hot-spare concept need not be implemented across multiple memory boards. That is, it may be possible to implement the hot-spare functionality across multiple DIMMs on a single memory board, or from bank-to-bank on a single DIMM. Thus, a bank operated as a hot-spare to a “primary” bank may reside either in the same DIMM, on a different DIMM yet still on the same memory board, or within a DIMM on another memory board. Further, testing of the hot-spare memory for memory errors of the embodiments of the invention discussed herein may use SMI routines in the system management mode, and as such the testing of the memory may be transparent to the operating system and end-user programs; however, it may be possible to program an end-user program to perform the testing, but such a system would not be transparent to the operating system, and may be operating system platform specific. It is intended that the following claims be interpreted to embrace all such variations and modifications.

What is claimed is:
 1. A method comprising: operating a computer system comprising a first bank of memory utilized, at least in part, as primary memory of the computer system, and the computer system also comprising a second bank of memory utilized as a backup to the first bank of memory; and testing the second bank of memory for memory errors after power-on self test procedures have completed, and during run-time.
 2. The method as defined in claim 1 wherein testing the second bank of memory further comprises testing for memory errors in a region outside a directly addressable memory space.
 3. The method as defined in claim 2 wherein testing the second bank of memory further comprises: mapping a test portion of the memory on the second bank of memory to a memory test region within the directly addressable memory space; and checking for bit errors on the test portion.
 4. The method as defined in claim 3 wherein mapping further comprising: allocating a portion of the directly addressable memory space to the memory test region; mapping the test portion to the memory test region; and repeating the mapping the test portion to the memory test region step for a plurality of portions of the memory on the second bank of memory.
 5. The method as defined in claim 3 wherein checking for bit errors further comprises: writing a bit pattern to the test portion; reading a bit pattern from the test portion; and verifying the bit pattern read matches the bit pattern written.
 6. The method as defined in claim 1 wherein the testing step further comprises: issuing a system management interrupt (SMI) signal; and executing a SMI routine to perform the testing step.
 7. The method as defined in claim 6 further comprising: reading a first register by the SMI routine, the first register identifies a test portion of the memory on the second bank of memory; mapping the test portion to the directly addressable memory space; and checking for bit errors in the test portion.
 8. The method as defined in claim 7 further comprising: reading a second register by the SMI routine, the second register identifies a memory test region in a directly addressable memory space of the computer system; and wherein the mapping the test portion to the directly addressable memory space further comprises mapping the test portion to the memory test region.
 9. The method as defined in claim 8 wherein each of the reading the register steps further comprising reading the first and second registers contained in a bridge logic device.
 10. The method as defined in claim 7 further comprising updating the first register with an address of a next test portion.
 11. The method as defined in claim 1 wherein the testing step takes place after the second bank of memory is installed in the computer system while the computer system is operational.
 12. The method as defined in claim 1 wherein the testing step takes place after an error occurs on the first bank of memory, and prior to the memory of the second bank of memory being used as the primary memory.
 13. A computer system comprising: a processor; a first memory bank and a second memory bank, the first memory bank used as a primary memory, and the second memory bank used as a hot-spare memory; a first bridge logic coupling the processor to the first and second memory banks; a second bridge logic coupled to the first bridge logic by way of a primary expansion bus; a disk drive storage device coupled to the second bridge logic by way of a secondary expansion bus; and wherein the computer system is adapted to test for memory errors in the second memory bank after power-on self test procedures, and during run-time of the computer system.
 14. The computer system as defined in claim 13 wherein the computer system is further adapted to test for memory errors in the second memory bank where memory of the second memory bank is outside the directly addressable memory area.
 15. The computer system as defined in claim 13 further comprising: a system management device coupled to the processor; and wherein the system management device periodically generates a system management interrupt (SMI) to the processor that invokes a SMI routine that tests for memory errors.
 16. The computer system as defined in claim 15 further comprising: said first bridge logic comprising a test address register; and wherein the SMI routine reads the test address register, and maps a portion of the memory of the second memory bank identified by the test address register to a test memory area in the directly addressable memory area.
 17. The computer system as defined in claim 16 further comprising: said first bridge logic comprising a base address register; and wherein the SMI routine reads the base address register from the first bridge logic device to identify the test memory area.
 18. The computer system as defined in claim 13 wherein the second memory bank is installed in the computer system while the computer system is operational.
 19. The computer system as defined in claim 13 further comprising the computer system adapted to test for memory errors in the second memory bank prior to making the memory of the second memory bank the primary memory.
 20. The computer system as defined in claim 13 wherein the first and second memory banks reside on a same memory board.
 21. The computer system as defined in claim 13 wherein the first memory bank resides on a first memory board, and the second memory bank resides on a second memory board.
 22. A read only memory device storing instructions executable by a processor in a computer system, the instructions comprising: reading a test register to determine an address of memory to be tested for errors; mapping the memory to a directly addressable memory area; checking the memory for errors; and updating the register with an address of a next portion of memory to be tested for errors.
 23. The read only memory device as defined in claim 22 wherein the instructions further comprising reading a base address register to determine a location in the directly addressable address space to which to map the memory to be tested.
 24. A computer system comprising: a means for executing programs; a host bus coupled to the means for executing; a first and second means for storing programs and data, the second means for storing utilized as a backup for the first means for storing; a memory bus coupled to the first and second means for storing; a primary expansion bus; a first means for translating bus communication protocols coupling the memory bus, the host bus and the primary expansion bus; a secondary expansion bus; a means for long term storage of programs and data coupling the secondary expansion bus; a second means for translating bus communication protocols coupled to the primary and secondary expansion buses; wherein the computer system is adapted to check for storage errors in the second means for storing after boot procedures, and during run-time of the computer system.
 25. The computer system as defined in claim 24 wherein the computer system is further adapted to check for storage errors in the second means for storing in a region outside a directly addressable space.
 26. The computer system as defined in claim 25 further comprising: a means for generating system management interrupts coupled to the means for executing; and wherein the means for generating periodically generates a system management interrupt (SMI) to the means for executing, the SMI invokes a SMI routine that tests for storage errors on the second means for storing.
 27. The computer system as defined in claim 26 further comprising: said first means for translating comprising a test address register; and wherein the SMI routine reads the test address register from the first means for translating, and maps a portion of a storage area of the second means for storing identified by the test address register to a memory area in the directly addressable space to perform the testing for storage errors.
 28. The computer system as defined in claim 24 wherein the second means for storing is installed in the computer system while the computer system is operational.
 29. The computer system as defined in claim 24 further comprising the computer system adapted to check for storage errors in the second means for storing prior to making the second means for storing a primary means for storing for the computer system.